Lossless data embedding



FIELD OF THE INVENTION 

The invention relates to a method and arrangement for losslessly embedding
data in a host signal. The invention also relates to methods and arrangements for retrieving
the data and reconstructing the host signal.

BACKGROUND OF THE INVENTION 

An undesirable side effect of many watermarking and data-hiding schemes is
that the composite signal (e.g. images, video, audio) into which the auxiliary data has been
embedded is distorted. Finding an optimal balance between the amount of embedded data and
the induced distortion is therefore an active field of research. There has been considerable
progress in understanding the fundamental limits of the capacity versus distortion aspect of
watermarking and data-hiding schemes.

Sometimes, it is not only desired to embed data with little distortion, but also
to be able to remove said distortion completely. A data embedding scheme providing such
capability is referred to as a lossless or reversible data-hiding or embedding scheme. Lossless
data-hiding schemes are important in cases where no degradation of the original host signal is
allowed. This is for example true for medical imagery and multimedia archives of valuable
original works.

A known lossless data hiding method is disclosed in Jessica
Fridrich, Miroslav Goljan and Rui Du, "Lossless Data Embedding for all Image Formats",
Proceedings of SPIE, Security and Watermarking of Multimedia Contents, San Jose,
California, 2002. In this known method, a feature or subset B of signal X (e.g. the least
significant bit plane of a bitmap image, or the least significant bits of specific DCT
coefficients of a JPEG image) is extracted from the signal X and subjected to lossless
compression. The compressed subset B is concatenated with auxiliary data (payload) and
inserted into the signal X in place of the original subset. The method is based on the
assumption that the subset B can (i) be losslessly compressed, and (ii) randomized while
preserving the perceptual quality of signal X.

At the receiver end, the distorted composite signal can be reproduced using
conventional equipment. In order to remove the distortion completely, the concatenated bit
stream comprising the compressed subset is extracted and decompressed. The original subset
B is subsequently reinserted into the signal X.

The Fridrich et al. article discloses practical examples of lossless data-hiding,
but pays little attention to the theoretical limits of lossless embedding schemes.

OBJECT AND SUMMARY OF THE INVENTION 

It is an object of the invention to provide lossless data embedding schemes that
are more efficient in a rate versus distortion sense.

To this end, the invention provides a method and arrangement for embedding
auxiliary data in a host signal, the method comprising the steps of: using a predetermined
data embedding method having a given embedding rate and distortion to produce a composite
signal; using a portion of said embedding rate to accommodate restoration data identifying
the host signal conditioned on said composite signal; and using the remaining embedding rate
for embedding said auxiliary data.

The invention exploits the insight that it suffices for a receiver to remove the
uncertainty of the original host signal, given the received composite signal. The amount of
data required to remove said uncertainty is smaller than the amount of data
required to encode the original host signal itself. The inventors have also formulated the
theoretical boundaries of lossless data embedding capacity.

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 shows diagrams representing the boundaries of lossless data embedding
schemes.

Fig. 2 shows schematically a diagram of an arrangement for losslessly
embedding auxiliary data in a host signal in accordance with the invention.

Fig. 3 shows diagrams illustrating the performance of embodiments of lossless
data embedding arrangements in accordance with the invention.

Fig. 4 shows a schematic diagram of an arrangement for reconstructing a host
signal in accordance with the invention.

Figs. 5 and 6 illustrate embodiments of accommodating restoration data in a
host signal in accordance with the invention.

Figs. 7 and 8 show diagrams illustrating the difference between symmetrical
and asymmetrical channels.

DESCRIPTION OF PREFERRED EMBODIMENTS 

The prior art compression and bit replacement scheme will be discussed more
generally first. The signal source of Fridrich et al. produces a sequence of signal samples, for
example, the pixels of an image. The subset B of the signal being compressed (a bit plane,
least significant bits of specific DCT coefficients) constitutes a source of binary symbols
x1..xN. It will be assumed that the probabilities p0=Pr{x=0} and p1=Pr{x=1} are not equal, i.e.
the entropy $H(p_0) = -p_0\log_2(p_0) - p_1\log_2(p_1)$ of the source is less than 1. In that case,
information theory teaches that the sequence of N symbols can be compressed into a shorter
sequence y1..yK of K≈N×H(p0) symbols. A reversible data hiding scheme is now obtained by
appending N×(1-H(p0)) auxiliary data symbols to the sequence y1..yK. For example, if p0=0.9
and p1=0.1, the entropy of the source is H(p0)≈0.47, so that (for large N) only 0.47×N bits are
needed to represent the original host symbols. Accordingly, 0.53×N auxiliary data symbols
can be embedded as payload into the remainder of the sequence y1..yN. At the decoder end,
the original sequence x1..xN is restored by decompressing y1..yK. The remainder yK+1..yN of
the sequence is interpreted as auxiliary data.
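
The arithmetic of this example can be checked with a short sketch. The function names below are illustrative only and do not appear in the Fridrich et al. article; the computation assumes a memoryless binary source and the large-N limit.

```python
import math


def binary_entropy(p0: float) -> float:
    """Entropy in bits/symbol of a binary source with Pr{x=0} = p0."""
    if p0 <= 0.0 or p0 >= 1.0:
        return 0.0
    p1 = 1.0 - p0
    return -p0 * math.log2(p0) - p1 * math.log2(p1)


def payload_fraction(p0: float) -> float:
    """Fraction of the N host positions left for auxiliary data (large N)."""
    return 1.0 - binary_entropy(p0)


if __name__ == "__main__":
    print(f"H(0.9)      = {binary_entropy(0.9):.2f} bits/symbol")
    print(f"1 - H(0.9)  = {payload_fraction(0.9):.2f} bits/symbol")
```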

The data rate of the Fridrich et al. embedding scheme is R=1-H(p0)
bits/sample. As the bits of the compressed sequence y1..yK are uncorrelated with those of
x1..xN, and the auxiliary data are randomly chosen, one easily sees that the distortion between
x1..xN and y1..yN is D=0.5. The distortion of the Fridrich et al. scheme can be reduced by
performing the construction above on only a fraction α of the symbols in x1..xN. This is
referred to as time-sharing. Both the data rate and the distortion then decrease by the factor α.
The resulting data rate and distortion of this "simple" time-sharing embedding scheme are
R=α(1-H(p0)) and D=α/2, respectively, or

\[ R_{simple}(D) = 2D\,(1 - H(p_0)) \tag{1} \]

For p0=0.9, this linear rate-distortion function is shown in Fig. 1 as a dash-dotted line 11.

The inventors have found that linear equation (1) is not optimal. They have
found theoretical bounds on the capacity of lossless data embedding. More particularly, the
achievable data rate R_rev of a reversible embedding scheme for a memoryless binary source
and p0≥0.5 is, for 0≤D≤0.5:

\[ R_{rev}(D) = H(\max(p_0 - D,\ 0.5)) - H(p_0) \tag{2} \]

For p0=0.9, this rate-distortion function is shown in Fig. 1 as a solid line 12. Equation (2) is
generally applicable to asymmetric channels (the inventors use the notion "channels" for data
embedders). For symmetrical channels, the rate is:

\[ R_{sym}(D) = H(p_0 + (1 - 2p_0)D) - H(p_0) \tag{3} \]

For p0=0.9, this rate-distortion function is shown in Fig. 1 as a dashed line 13. The
embedding rate for a symmetrical channel is always between the optimal embedding rate and
the time-sharing embedding rate. Practical examples of symmetrical and asymmetrical
channels will be given later. The lines 11, 12 and 13 in Fig. 1 relate to p0=0.9 (and p1=0.1).
For illustration, similar lines 14, 15 and 16 are also shown for p0=0.8.
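
The three rate-distortion functions (1), (2) and (3) can be evaluated numerically as follows. This is a minimal sketch; the function names are chosen here for illustration, and the formulas are taken as reconstructed from equations (1) to (3) with p0≥0.5 and D between 0 and 0.5.

```python
import math


def h2(p: float) -> float:
    """Binary entropy in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def rate_time_sharing(d: float, p0: float) -> float:
    """Equation (1): linear time-sharing rate."""
    return 2.0 * d * (1.0 - h2(p0))


def rate_reversible(d: float, p0: float) -> float:
    """Equation (2): achievable rate of a reversible scheme."""
    return h2(max(p0 - d, 0.5)) - h2(p0)


def rate_symmetric(d: float, p0: float) -> float:
    """Equation (3): rate for a symmetrical channel."""
    return h2(p0 + (1.0 - 2.0 * p0) * d) - h2(p0)


if __name__ == "__main__":
    p0 = 0.9
    for d in (0.1, 0.25, 0.4, 0.5):
        print(f"D={d:.2f}  simple={rate_time_sharing(d, p0):.4f}  "
              f"sym={rate_symmetric(d, p0):.4f}  rev={rate_reversible(d, p0):.4f}")
```

Running the loop for p0=0.9 and p0=0.8 shows the ordering stated in the text: the symmetrical rate lies between the time-sharing rate and the reversible bound, and all three coincide at D=0.5.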

Fig. 2 shows a general schematic diagram of a lossless data embedding
arrangement in accordance with the invention. The arrangement receives a digital
representation of a perceptual host signal, for example, an image Im. An extraction stage 21
extracts therefrom a sequence of host symbols X={x1..xN} in which auxiliary data will be
embedded. Like in the Fridrich et al. embedding scheme, the host signal can be obtained by
extracting from an image a bit plane or the least significant bits of specific DCT coefficients.

The arrangement further comprises a data embedder 23, which is conventional
in the sense that this embedder introduces distortion of the host signal. The "squared error" is
often used to represent distortion:

\[ D(x, y) = (y - x)^2 \]

The embedding process produces a composite signal Y={y1..yN}. It will initially be assumed
that the host signal X and the composite signal Y are binary signals with alphabet {0,1}. The
composite signal Y is inserted back into the image by an insertion stage 22 to obtain a
watermarked image Im'.

A restoration encoder 24 receives the host signal X and the composite signal
Y. The restoration encoder maintains a record of which host symbols have undergone which
modification and encodes said information into restoration data r. The expression "which
host symbols have undergone which modification" must be interpreted broadly. If the
distortion is either D=0 or D=1 (which is the case in this embodiment), then it suffices to
identify which symbols have undergone distortion. For other types of embedder 23, the
amount of distortion must be encoded as well. It should be noted that the restoration encoder
24 represents a functional feature of the invention. The circuit does not need to be physically
present as such. In the practical embodiment of the arrangement being presented hereinafter,
the information as to which symbols have been distorted is inherently produced by the
embedder 23 itself.

It will be shown that the restoration data rate in bits/symbol is smaller than the
embedding rate of embedder 23. The remaining embedding capacity is used for embedding
auxiliary data (payload) w. The restoration data r and payload w are concatenated in a
concatenation circuit 25. It is the concatenated data d which is applied to the embedder 23 for
embedding.

In a preferred embodiment of the arrangement, the embedder 23 operates in
accordance with the teachings of an article by M. van Dijk and F.M.J. Willems, "Embedding
Information in Grayscale Images", Proceedings of the 22nd Symposium on Information
Theory in the Benelux, Enschede, The Netherlands, May 15-16, 2001, pp. 147-154. In this
article, the authors describe lossy embedding schemes that have an efficient rate-distortion
ratio. More particularly, a number L (L>1) of host signal samples are grouped together to
provide a block or vector of host symbols. The host symbols of a block are modified such
that the syndrome of said block represents one or more (but less than L) embedded message
symbols d.

The expression "syndrome" is a well-known notion in the field of error
correction. In error correction schemes, the syndrome of a received data word is determined
by multiplying it with a given matrix. If the syndrome is zero, the data word is correct. If the
syndrome is unequal to zero, the non-zero value represents the position (or positions) of
erroneous data word symbols. Hamming error correction codes have Hamming distance 3.
They allow 1 erroneous data symbol to be corrected. Other codes, such as Golay codes, allow
plural symbols of a data word to be corrected.

In a mathematical sense, the data embedding method taught by M. van Dijk et
al. resembles error correction. In order to embed a message symbol d in a block of L host
symbols x1..xL, the embedder modifies one or more host symbols of said block.
Mathematically, an output block y1..yL is computed which has the desired syndrome and is
closest to x1..xL in a Hamming sense. By way of example, data embedding using a Hamming
code with block length L=3 will now briefly be summarized.

To compute the syndrome of a block or vector of 3 bits, the vector is
multiplied with the following 3×2 parity check matrix (reconstructed here from the syndromes
listed for each host vector later in this description):

\[ \begin{pmatrix} 0 & 1 \\ 1 & 0 \\ 1 & 1 \end{pmatrix} \]
Note that all mathematical operations are modulo-2 operations. For example, the syndrome of
input vector (001) is (1 1), because

\[ \begin{pmatrix} 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 \end{pmatrix} \]

It is this syndrome (1 1) which represents the embedded data. Obviously, the syndrome of the
host vectors is generally not the message to be embedded. One of the host symbols must
therefore be modified. If, for example, the message (01) is to be embedded instead of (1 1),
the embedder 23 changes the second host symbol so that original host vector (001) is
modified into (011):

\[ \begin{pmatrix} 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 1 \end{pmatrix} \]



The distortion of this embedding scheme per 3 symbols is

\[ \tfrac{1}{4}\cdot 0^2 + \tfrac{3}{4}\cdot 1^2 = \tfrac{3}{4} \]

(probability 1/4 that none of the host symbols is changed and probability 3/4 that one symbol
is changed by ±1), so that the average distortion per symbol is D=1/4. The embedding rate is
2 bits per block, i.e. R=2/3 bits/symbol. The corresponding (R,D)-pair is shown as a + sign
denoted 302 in Fig. 3.
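
The block-length-3 embedding described above can be sketched as follows. The parity check rows used here are an assumption: they are chosen so that the syndromes agree with the examples in the text, i.e. (001) has syndrome (1 1) and (011) has syndrome (0 1). The helper names are illustrative.

```python
from itertools import product

# One parity check row per host symbol position (assumed, see lead-in).
H = [(0, 1), (1, 0), (1, 1)]


def syndrome(block):
    """Multiply the block with the parity check matrix, modulo 2."""
    return tuple(sum(b * row[k] for b, row in zip(block, H)) % 2 for k in range(2))


def embed(block, message):
    """Return the block closest to `block` whose syndrome equals `message`."""
    if syndrome(block) == message:
        return tuple(block)
    for i in range(len(block)):
        candidate = list(block)
        candidate[i] ^= 1  # flip exactly one host symbol
        if syndrome(candidate) == message:
            return tuple(candidate)
    raise ValueError("no single-symbol change yields the requested syndrome")


def average_distortion():
    """Average number of changed symbols per symbol, messages equiprobable."""
    blocks = list(product((0, 1), repeat=3))
    messages = list(product((0, 1), repeat=2))
    flips = sum(
        sum(a != b for a, b in zip(x, embed(x, d)))
        for x in blocks
        for d in messages
    )
    return flips / (len(blocks) * len(messages) * 3)


if __name__ == "__main__":
    print(syndrome((0, 0, 1)))
    print(embed((0, 0, 1), (0, 1)))
    print(average_distortion())
```

Because every host vector keeps its syndrome for exactly one of the four messages, the distortion does not depend on the source statistics.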

In a similar manner, 3 data bits can be embedded in a block of 7 signal
symbols, 4 bits can be embedded in 15 signal symbols, etc. More generally, the Hamming
code based embedding schemes allow m message symbols to be embedded in blocks of
L=2^m-1 host symbols by modifying at most 1 host symbol. The embedding rate is

\[ R = \frac{m}{2^m - 1} \]

and the distortion is

\[ D = \frac{1}{2^m} \]

since one of the L symbols is changed with probability L/(L+1), in agreement with D=1/4 for m=2.

Fig. 3 shows the corresponding (R,D)-pairs of this (lossy, irreversible)
embedding scheme for m=2,3,...,6 as + signs denoted 302, 303, ..., 306. The (R,D)-pair for
m=1 (which is simple bit replacement) is also shown as + sign denoted 301. Note that the
(R,D) values do not depend on the entropy H(p) of the binary source. Fig. 3 also shows the
(R,D) pair 300 (R=0.53 bits/symbol, D=0.5) of the Fridrich et al. lossless embedding scheme
for p0=0.9. The theoretical boundaries 11, 12 and 13 of lossless embedding schemes for
p0=0.9 (cf. Fig. 1) are also shown in Fig. 3 for reference.
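
The (R,D)-pairs of the lossy Hamming family can be tabulated with a few lines. This sketch assumes that the distortion equals 1/2^m, which follows from one of the L=2^m-1 symbols being changed with probability L/(L+1); the function name is illustrative.

```python
def hamming_pair(m: int):
    """Block length, embedding rate and distortion for m message bits per block."""
    block_len = 2 ** m - 1
    rate = m / block_len          # bits per host symbol
    distortion = 1 / 2 ** m       # average squared error per host symbol
    return block_len, rate, distortion


if __name__ == "__main__":
    for m in range(1, 7):
        block_len, rate, distortion = hamming_pair(m)
        print(f"m={m}  L={block_len:2d}  R={rate:.4f}  D={distortion:.4f}")
```

For m=1 the block consists of a single symbol, which corresponds to simple bit replacement with R=1 and D=0.5.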

In accordance with the invention, a portion of the embedded message data bits
d is now used to identify whether one of the signal samples has been modified and, if so,
which sample that is. For the Hamming codes with block length 3 (m=2, L=3), there are 4
possibilities: none of the three host symbols has been changed, the first symbol has been
modified, the second symbol has been modified, or the third symbol has been modified. If the
entropy H(p) of the signal source is equal to 1, then all events have equal probabilities. In that
case, both embedded message bits per block are required for restoration. However, if the
entropy H(p) of the signal source is unequal to 1, then the events have different probabilities,
and less than m restoration bits are required. This leaves space to embed 'real' auxiliary data
bits (also referred to as payload) in the blocks of host symbols.

Like in the Fridrich et al. example, it will be assumed that p0=0.9.
Accordingly, the probability p(x=000) that the source produces host vector (000) is
(0.9)^3=0.729. The probability p(x=001) that the source produces host vector (001) is
(0.9)^2×(0.1)=0.081, etc. Assume that the embedder 23 of the arrangement has produced a
composite vector y=000. The original host vector x could have been (000). In that case, none
of the original signal samples has been modified. But the original host vector could also have
been (001), (010), or (100). In that case, one of the host symbols has been modified. The
probability that the host vector was x=000, given the generation of y=000, is:

\[ p(x{=}000 \mid y{=}000) = \frac{p(x{=}000)}{p(x{=}000) + p(x{=}001) + p(x{=}010) + p(x{=}100)} = 0.75 \]

In a similar manner, the probabilities that y=000 originates from host vector (001), (010) or
(100) can be computed. This yields:

p(x=001|y=000) = 0.083
p(x=010|y=000) = 0.083
p(x=100|y=000) = 0.083

Each composite vector y has four possible original host vectors x. The conditional
probabilities p(x|y) are summarized in the following Table. The Table also includes, for each
block y, the corresponding conditional entropy H(x|y). Said conditional entropy represents the
uncertainty of the original host vector x, given the composite vector y. The Table further
includes the probability p(y), assuming that the messages 00, 01, 10 and 11 have equal
probabilities 1/4. For example, the probability p(y=000) has been computed as follows:

\[ p(y{=}000) = \tfrac{1}{4}p(x{=}000) + \tfrac{1}{4}p(x{=}001) + \tfrac{1}{4}p(x{=}010) + \tfrac{1}{4}p(x{=}100) = 0.2430 \]



x       syndrome  p(x)    y=000   y=001   y=010   y=011   y=100   y=101   y=110   y=111
000     00        0.729   0.7500  0.8804  0.8804  -       0.8804  -       -       -
001     11        0.081   0.0833  0.0978  -       0.4709  -       0.4709  -       -
010     10        0.081   0.0833  -       0.0978  0.4709  -       -       0.4709  -
011     01        0.009   -       0.0109  0.0109  0.0523  -       -       -       0.3214
100     01        0.081   0.0833  -       -       -       0.0978  0.4709  0.4709  -
101     10        0.009   -       0.0109  -       -       0.0109  0.0523  -       0.3214
110     11        0.009   -       -       0.0109  -       0.0109  -       0.0523  0.3214
111     00        0.001   -       -       -       0.0058  -       0.0058  0.0058  0.0357

H(x|y)=                   1.2075  0.6316  0.6316  1.2891  0.6316  1.2891  1.2891  1.7506
p(y)=                     0.2430  0.2070  0.2070  0.0430  0.2070  0.0430  0.0430  0.0070



The conditional entropy H(X|Y) of the source, averaged over all blocks y,
represents the number of bits to reconstruct x, given y. In the present example, said average
entropy equals:

\[ H(X \mid Y) = \sum_{y} p(y)\,H(x \mid y) = 0.8642 \text{ bits/block} \]

Accordingly, 0.8642 restoration bits per block are required to identify the original block. This
leaves 2-0.8642=1.1358 bits/block for embedding payload. The data rate R is thus:

\[ R = 1.1358 / 3 \approx 0.3786 \text{ bits/symbol} \]

Note that the distortion D of the composite signal is not affected by the particular meaning
that has now been assigned to the embedded data d. As described before, the distortion of this
lossless embedding scheme is:

D=1/4

The corresponding (R,D) pair is shown as a 0 sign denoted 312 in Fig. 3. It will be
appreciated that this lossless embedding scheme has a considerably higher embedding rate R
than the Fridrich et al. lossless embedding scheme having the same distortion (cf. 302). In a
similar manner, the rate-distortion pairs for Hamming codes having lengths 7, 15, 31, 63, etc.
can be computed. Fig. 3 shows the corresponding (R,D)-pairs for m=3,...,6 as 0 signs denoted
313, ..., 316.
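
The conditional entropy and the resulting lossless rate can be reproduced by enumerating all host vectors and messages. The following is a sketch under stated assumptions: the parity check rows are those implied by the syndromes in the Table, the source is memoryless with 0<p0<1, and the messages are equiprobable. The helper names are illustrative.

```python
import math
from itertools import product

# Parity check rows, one per symbol position (assumed from the Table's syndromes).
H = [(0, 1), (1, 0), (1, 1)]


def syndrome(block):
    return tuple(sum(b * row[k] for b, row in zip(block, H)) % 2 for k in range(2))


def embed(block, message):
    """Closest block (Hamming sense) whose syndrome equals `message`."""
    if syndrome(block) == message:
        return tuple(block)
    for i in range(len(block)):
        candidate = list(block)
        candidate[i] ^= 1
        if syndrome(candidate) == message:
            return tuple(candidate)
    raise ValueError("unreachable for a Hamming code")


def conditional_entropy(p0: float) -> float:
    """H(X|Y) in bits per block of 3 symbols, for 0 < p0 < 1."""
    joint = {}
    for x in product((0, 1), repeat=3):
        p_x = math.prod(p0 if b == 0 else 1.0 - p0 for b in x)
        for d in product((0, 1), repeat=2):
            key = (x, embed(x, d))
            joint[key] = joint.get(key, 0.0) + p_x / 4.0
    p_y = {}
    for (_, y), p in joint.items():
        p_y[y] = p_y.get(y, 0.0) + p
    return -sum(p * math.log2(p / p_y[y]) for (_, y), p in joint.items())


def lossless_rate(p0: float) -> float:
    """Payload rate in bits/symbol: (2 - H(X|Y)) / 3."""
    return (2.0 - conditional_entropy(p0)) / 3.0


if __name__ == "__main__":
    print(f"H(X|Y) = {conditional_entropy(0.9):.4f} bits/block")
    print(f"R      = {lossless_rate(0.9):.4f} bits/symbol")
```

For an equiprobable source (p0=0.5) the same computation gives H(X|Y)=2 bits/block, so no capacity is left for payload, as stated earlier in the text.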

Fig. 4 shows a schematic diagram of an arrangement for reconstructing the
original host signal from a received composite signal. The arrangement receives the
image Im'. It can directly be applied to a reproduction device for display. The arrangement
further comprises an extraction stage 41, which extracts from the received image the
composite signal Y={y1..yN} (e.g. a given bit plane) in which the data d has been embedded.
The extraction stage 41 is identical to the extraction stage 21 of the embedding arrangement
which is shown in Fig. 2.

The composite signal Y is applied to a data retrieval circuit 43, which retrieves
the data d being embedded in the composite signal. In the preferred embodiment, wherein the
data has been embedded using Hamming codes of length L, the retrieval circuit 43
determines the syndrome of each block of symbols y1..yL. The extracted data is a
concatenation of payload w and restoration bits r. They are separated in a splitter 44, which
performs the reverse operation of concatenation circuit 25, which is shown in Fig. 2. The
payload w is thus retrieved.

The restoration bits r and the composite signal Y are used, by a reconstruction
unit 45, to reconstruct the original host signal X. The reconstruction unit is arranged to undo
the modification(s) applied to the original host signal X=x1..xN. In the preferred embodiment,
the restoration data r identifies whether one of the symbols in a block Y has been modified
and, if so, which symbol that is. In more general terms, the restoration data identifies the
distortion D of the symbols y1..yN. The reconstructed host signal X is finally inserted back
into the image by an insertion stage 42 to obtain the original image Im. The insertion stage 42
is identical to the insertion stage 22 of the embedding arrangement which is shown in Fig. 2.
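
The receiver-side steps (retrieval 43, splitting 44, reconstruction 45) can be outlined as below. This is a simplified sketch, not the arrangement itself: the parity check rows are assumed as before, the boundary between r and w is passed in as a known length (a hypothetical layout), and the restoration data is assumed to have been decoded already into a list of modified symbol positions.

```python
# Parity check rows for block length 3 (assumed, consistent with the text's examples).
H = [(0, 1), (1, 0), (1, 1)]


def syndrome(block):
    return tuple(sum(b * row[k] for b, row in zip(block, H)) % 2 for k in range(2))


def retrieve(composite, block_len=3):
    """Data retrieval: concatenate the syndromes of successive blocks."""
    bits = []
    for start in range(0, len(composite) - block_len + 1, block_len):
        bits.extend(syndrome(composite[start:start + block_len]))
    return bits


def split(bits, n_restoration):
    """Splitter: first n_restoration bits are r, the rest is payload w (hypothetical layout)."""
    return bits[:n_restoration], bits[n_restoration:]


def reconstruct(composite, modified_positions):
    """Reconstruction: flip back every symbol the embedder modified."""
    host = list(composite)
    for pos in modified_positions:
        host[pos] ^= 1
    return host


if __name__ == "__main__":
    y = [0, 1, 1, 0, 0, 0]
    d = retrieve(y)
    print(d, split(d, 1), reconstruct(y, [1]))
```

In a practical system the positions passed to `reconstruct` would be obtained by entropy-decoding r; that step is omitted here because the text does not prescribe a particular code.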

In the embodiment described above, it has been assumed that the host signal
X, the composite signal Y, and the data symbols are binary signals with alphabet {0,1}.
However, the invention is not restricted to binary signals. For example, a ternary embedding
scheme as disclosed in the van Dijk et al. article may be used as well. In a ternary data
embedder, the data symbols d belong to an alphabet {0,1,2}. More particularly:

- signal sample values y=0,3,6,... represent message symbol d = y mod 3 = 0,

- signal sample values y=1,4,7,... represent message symbol d = y mod 3 = 1, and

- signal sample values y=2,5,8,... represent message symbol d = y mod 3 = 2.

The data embedder 23 (see Fig. 2) now receives the original image signal (the
circuits 21 and 22 are redundant), and modifies the least significant portion of a signal sample
xi such that the data embedded in modified sample yi is d. In a similar manner as described
for binary embedding, ternary symbols can also be embedded in groups of host symbols. It is
again possible to do this by using (ternary) Hamming codes or a (ternary) Golay code.
Examples thereof are described in Applicant's non-prepublished International patent
application IB02/01702 (Applicant's docket PHNL010358).
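
Per-sample ternary embedding can be sketched as follows. The sample range 0..255 and the function names are assumptions made for illustration; the text only specifies that the embedded symbol is y mod 3.

```python
def embed_ternary(x: int, d: int, max_value: int = 255) -> int:
    """Return the sample closest to x in 0..max_value with y % 3 == d."""
    if d not in (0, 1, 2):
        raise ValueError("d must be a ternary symbol")
    delta = (d - x) % 3          # 0, 1 or 2 steps upward
    if delta == 2:
        delta = -1               # one step down is closer than two steps up
    y = x + delta
    if y < 0:                    # stay inside the assumed sample range
        y += 3
    elif y > max_value:
        y -= 3
    return y


def extract_ternary(y: int) -> int:
    """Message symbol represented by sample value y."""
    return y % 3


if __name__ == "__main__":
    for d in (0, 1, 2):
        y = embed_ternary(7, d)
        print(d, y, extract_ternary(y))
```

Away from the range boundaries a sample changes by at most 1, which is why the ternary scheme carries more than one bit per modified sample at the same squared error as bit replacement.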

In yet another data embedding scheme, the message symbols d are embedded
in pairs of signal samples. In this scheme, the two-dimensional symbol space of signal
samples (xa,xb) is "colored" with 5 colors. Each point on the grid denotes a pair of signal
samples, and has a color different from its neighbors. The colors are numbered 0..4, and each
color represents a message symbol d ∈ {0,1,2,3,4}. In this embodiment, the embedder 23
checks whether (xa,xb) has the color d to be embedded. If that is not the case, it changes the
symbol pair (xa,xb) such that the modified pair has the color d. It will be appreciated that the
two-dimensional embedding scheme can be extended to more dimensions. In a
three-dimensional grid, for example, each point cannot only be "moved" to the four neighbors in
the same layer, but also up or down. Seven colors, i.e. seven message symbols, are now
available.
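
One way to obtain such a coloring is a linear map modulo the number of colors. The formulas below are an assumption made for illustration (the text does not state which coloring is used); they have the required property that a point and its nearest neighbors together carry every color exactly once. Range limits of the samples are ignored in this sketch.

```python
def color_2d(xa: int, xb: int) -> int:
    """Assumed 5-coloring: a point and its 4 neighbors have 5 distinct colors."""
    return (xa + 2 * xb) % 5


def color_3d(xa: int, xb: int, xc: int) -> int:
    """Assumed 7-coloring for the three-dimensional extension."""
    return (xa + 2 * xb + 3 * xc) % 7


def embed_pair(xa: int, xb: int, d: int):
    """Move (xa, xb) by at most one step so that the pair has color d."""
    if color_2d(xa, xb) == d:
        return xa, xb
    for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if color_2d(xa + da, xb + db) == d:
            return xa + da, xb + db
    raise ValueError("d must be in 0..4")


if __name__ == "__main__":
    for d in range(5):
        print(d, embed_pair(10, 20, d))
```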

Practical embodiments of particular methods of accommodating the
restoration data r in the data d to be embedded will now be described. It is to be mentioned
thereby that the embedding rates R that can be attained using a given embedder 23 (such as
R=0.3786 bits/symbol for binary embedding using Hamming codes with block length 3) are
theoretical boundaries. The embedding rate can be approached for long sequences (large N)
of host signal samples.

In a first embodiment of the method in accordance with the invention, the host
signal is divided into large enough segments. The restoration data for each segment is
accommodated in a subsequent segment. The remaining capacity is used for embedding
payload. This is shown in Fig. 5, where numeral 51 denotes the original host signal Im. The
signal is divided into segments S(n), each comprising a given number of signal samples (here
image pixels). Numeral 52 denotes the embedded data stream d in time alignment with the
signal. As has been illustrated, the restoration bits r(n) for segment S(n) have been embedded
in segment S(n+1). The remaining portion of segment S(n+1) is used for accommodating
payload w. Note that the precise number of restoration bits may vary from segment to
segment. It is advantageous to identify the boundary between restoration bits r and payload w
in a segment, for example, by providing each series of restoration bits with an appropriate
end-code.

The figures shown in Fig. 5 are illustrative only. Let the segment length be N
(here N=3000) signal symbols. The embedder 23 (see Fig. 2) is based on a Hamming code with
block length 3, which allows R×N (here 2000) bits to be embedded in each segment. The
conditional entropy of the source is H(X|Y) (here 0.8642/3≈0.3 bits per symbol) for a given
probability p0 (here 0.9). The number of restoration bits to remove the uncertainty of segment
X, given Y, is H(X|Y)×N (here 0.3 bits/symbol × 3000 symbols = 900 bits). This leaves
R×N-H(X|Y)×N (here 2000-900=1100) bits for payload.
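
The bit budget of a segment in this first embodiment is a simple difference, sketched below with the rounded value of 0.3 bits/symbol used in the text. The function and parameter names are illustrative.

```python
def segment_budget(n_symbols: int, rate: float, cond_entropy: float):
    """Return (embedded bits, restoration bits, payload bits) for one segment.

    rate         -- embedding rate R of the embedder in bits/symbol
    cond_entropy -- H(X|Y) in bits/symbol
    """
    embedded = rate * n_symbols
    restoration = cond_entropy * n_symbols
    return embedded, restoration, embedded - restoration


if __name__ == "__main__":
    embedded, restoration, payload = segment_budget(3000, 2 / 3, 0.3)
    print(round(embedded), round(restoration), round(payload))
```

With the unrounded value 0.8642/3 bits/symbol the restoration share is somewhat smaller, so the payload per segment is slightly above the illustrative figure.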

Fig. 6 shows an alternative embodiment for accommodating the restoration
bits. In this embodiment, a segment S(n) with a given initial length is provided with payload
w only. The restoration bits r(n) for segment S(n) are accommodated in a subsequent segment
S(n+1). The subsequent segment S(n+1) is now assigned a length that is required to
accommodate the restoration bits r(n). The segment S(n+1) requires a new number of
restoration bits r(n+1) to be embedded in a yet further segment S(n+2), etc. This process is
repeated a number of times, e.g. until the subsequent segment is smaller than a given
threshold. The whole process is then repeated for a new segment with the given initial
length.
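
The succession of shrinking segments can be simulated as follows. This is a sketch under simplifying assumptions: every segment needs exactly H(X|Y) restoration bits per symbol, each follow-up segment is just long enough to carry them at rate R, and the handling of the restoration bits of the final short segment is left open, as it is in the text. The names are illustrative.

```python
import math


def segment_lengths(initial_len: int, rate: float, cond_entropy: float, threshold: int):
    """Lengths of S(n), S(n+1), ... until a segment is shorter than `threshold`."""
    if not 0.0 <= cond_entropy < rate:
        raise ValueError("restoration rate must be smaller than the embedding rate")
    lengths = [initial_len]
    while True:
        restoration_bits = cond_entropy * lengths[-1]
        next_len = math.ceil(restoration_bits / rate)   # symbols needed to carry r(n)
        if next_len >= lengths[-1]:
            break                                       # segments no longer shrink
        lengths.append(next_len)
        if next_len < threshold:
            break
    return lengths


if __name__ == "__main__":
    print(segment_lengths(3000, 2 / 3, 0.3, 100))
```

Each follow-up segment is shorter than its predecessor by roughly the factor H(X|Y)/R, so the chain terminates after a few steps.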

A data embedder, which turns an input symbol or vector X into an output
symbol or vector Y, represents a "channel". The data embedders described thus far constitute a
symmetrical channel. This can be seen in Fig. 7, which shows a graphical representation of
the data embedder based on Hamming codes having block length 3 as described before.
Fig. 8 shows the graphical representation of an asymmetrical channel. This particular
example is obtained by modifying input vectors (001), (010) and (100) into y=(111) instead
of y=(000), when d=00 is to be embedded (1's are preferably not changed into 0's). The
embedding rate of this embedding scheme is R=0.4335 bits/symbol (cf. rate R=0.3786 of the
corresponding symmetrical channel). Because 2 bits of a vector, instead of 1 bit, are now
sometimes changed, the distortion is slightly greater. In this case, the distortion is D=0.2701
(cf. D=0.25 of the symmetrical channel). Reference numeral 322 in Fig. 3 denotes the
corresponding (R,D) pair. As can be seen in this Fig., the performance of the asymmetrical
channel lies between boundary lines 12 and 13.
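
The effect of the asymmetrical modification can be examined by enumerating both embedders. The sketch below assumes the parity check rows implied by the earlier examples, a memoryless source with p0=0.9 and equiprobable messages; the asymmetrical rule maps the weight-1 vectors to (111) when d=00. The names are illustrative, and the results should come out close to the figures quoted in the text.

```python
import math
from itertools import product

H = [(0, 1), (1, 0), (1, 1)]  # assumed parity check rows


def syndrome(block):
    return tuple(sum(b * row[k] for b, row in zip(block, H)) % 2 for k in range(2))


def embed_symmetric(block, message):
    if syndrome(block) == message:
        return tuple(block)
    for i in range(3):
        candidate = list(block)
        candidate[i] ^= 1
        if syndrome(candidate) == message:
            return tuple(candidate)
    raise ValueError("unreachable for a Hamming code")


def embed_asymmetric(block, message):
    # Weight-1 vectors are moved to (111) rather than (000) when d=00.
    if message == (0, 0) and sum(block) == 1:
        return (1, 1, 1)
    return embed_symmetric(block, message)


def rate_and_distortion(embedder, p0):
    """Return (payload rate in bits/symbol, distortion per symbol)."""
    joint, flips = {}, 0.0
    for x in product((0, 1), repeat=3):
        p_x = math.prod(p0 if b == 0 else 1.0 - p0 for b in x)
        for d in product((0, 1), repeat=2):
            y = embedder(x, d)
            joint[(x, y)] = joint.get((x, y), 0.0) + p_x / 4.0
            flips += (p_x / 4.0) * sum(a != b for a, b in zip(x, y))
    p_y = {}
    for (_, y), p in joint.items():
        p_y[y] = p_y.get(y, 0.0) + p
    h_x_given_y = -sum(p * math.log2(p / p_y[y]) for (_, y), p in joint.items())
    return (2.0 - h_x_given_y) / 3.0, flips / 3.0


if __name__ == "__main__":
    for name, embedder in (("symmetric", embed_symmetric), ("asymmetric", embed_asymmetric)):
        rate, distortion = rate_and_distortion(embedder, 0.9)
        print(f"{name:10s}  R={rate:.4f}  D={distortion:.4f}")
```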

The invention can be summarized as follows. An undesirable side effect of
watermarking or data-hiding schemes is that the host signal is distorted. This invention
discloses a reversible or lossless data-hiding scheme that allows complete and blind (without
additional signaling) reconstruction of the host signal (X). This is achieved by
accommodating, in the embedded data (d) of the watermarked signal (Y), restoration data (r)
that identifies the host signal given the composite signal, i.e. the restoration data identifies
(24) which modifications the host signal has undergone during embedding (23). The
restoration data is accommodated in a portion of the embedding capacity of a conventional
embedder (23). The remainder of the capacity is used for embedding payload (w).
Claims

1. A method of embedding auxiliary data in a host signal, comprising the steps
of:

- using a predetermined data embedding method having a given embedding rate and
distortion to produce a composite signal;

- using a portion of said embedding rate to accommodate restoration data identifying the
host signal conditioned on said composite signal; and

- using the remaining embedding rate for embedding said auxiliary data.

2. A method as claimed in claim 1, comprising the steps of:

- dividing the host signal in successive segments;

- applying the predetermined data embedding method to said segments;

- accommodating in a segment the restoration data for a previous segment.

3. A method as claimed in claim 2, wherein each segment comprises the
restoration data for said previous segment as well as auxiliary data.

4. A method as claimed in claim 2, comprising the steps of:

(a) accommodating auxiliary data only in a segment;

(b) accommodating, in a subsequent segment, restoration data only for the previous
segment;

(c) adapting the length of said subsequent segment to the amount of restoration data being
embedded therein;

(d) repeating steps (b) and (c) a predetermined number of times.

5. A method as claimed in claim 4, wherein said step (d) comprises repeating
steps (b) and (c) until the length of the subsequent segment is smaller than a predetermined
threshold.


6. An arrangement for embedding auxiliary data (w) in a host signal (X),
comprising:

- a predetermined data embedder (23) having a given embedding rate and distortion to
produce a composite signal (Y) with embedded data (d);

- means (24,25) for generating restoration data (r) identifying the host signal (X)
conditioned on the composite signal (Y); and

- means (26) for accommodating said restoration data (r) in a portion of said embedded
data (d) and said auxiliary data (w) in the remaining portion of said embedded data.

7. A method of reconstructing a host signal from a composite signal representing
a distorted version of said host signal with data embedded therein, the method comprising the
steps of:

- retrieving the embedded data from the composite signal;

- splitting the embedded data into restoration data and auxiliary data;

- reconstructing the host signal using the reconstruction data, given the composite signal.

8. A method as claimed in claim 7, comprising the steps of:

- dividing the composite signal in successive segments;

- using the restoration data accommodated in a segment for reconstructing a previous
segment of the host signal.

9. A method as claimed in claim 8, wherein each segment of the composite
signal comprises the restoration data for said previous segment of the host signal as well as
auxiliary data.

10. An arrangement for reconstructing a host signal (X) from a composite signal
(Y) representing a distorted version of said host signal with data (d) embedded therein,
comprising:

- means (43) for retrieving the embedded data (d) from the composite signal (Y);

- splitting means (44) for splitting the embedded data (d) into restoration data (r) and
auxiliary data (w);

- reconstruction means (46) for reconstructing the host signal (X) using the reconstruction
data (r), given the composite signal (Y).

11. A composite information signal (Y) with embedded data (d) comprising
restoration data (r) and auxiliary data (w), said restoration data identifying the distortion of a
host signal (X) conditioned on said composite signal.
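
An undesirable side effect of watermarking or data-hiding schemes is that the
host signal is distorted. This invention discloses a reversible or lossless data-hiding scheme
that allows complete and blind (without additional signaling) reconstruction of the host signal
(X). This is achieved by accommodating, in the embedded data (d) of the watermarked signal
(Y), restoration data (r) that identifies the host signal given the composite signal, i.e. the
restoration data identifies (24) which modifications the host signal has undergone during
embedding (23). The restoration data is accommodated in a portion of the embedding capacity
of a conventional embedder (23). The remainder of the capacity is used for
embedding payload (w).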